# Multi-task distillation
## DeepSeek R1 Distill Qwen 32B Unsloth Bnb 4bit

Publisher: unsloth · License: Apache-2.0 · Tags: Large Language Model, Transformers, English

DeepSeek-R1 is the first-generation reasoning model released by the DeepSeek team. Trained with large-scale reinforcement learning, without supervised fine-tuning (SFT) as a preliminary step, it demonstrates excellent reasoning capabilities.

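For reference, a minimal loading sketch with Hugging Face transformers, assuming the repo id `unsloth/DeepSeek-R1-Distill-Qwen-32B-unsloth-bnb-4bit` matches the card title and that `bitsandbytes` is installed so the pre-quantized 4-bit weights load directly:

```python
# Minimal sketch: load the pre-quantized 4-bit checkpoint and generate text.
# The repo id and generation settings are assumptions, not part of the card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/DeepSeek-R1-Distill-Qwen-32B-unsloth-bnb-4bit"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # place layers on available GPUs automatically
)

# R1-style chat models expect the chat template for best results.
messages = [{"role": "user", "content": "Explain distillation in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```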
## XtremeDistil L12 H384 Uncased

Publisher: microsoft · License: MIT · Tags: Large Language Model, Transformers, English

XtremeDistilTransformers is a task-agnostic Transformer model distilled via task transfer, yielding a small universal model that can be applied to arbitrary tasks and languages.

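For reference, a minimal feature-extraction sketch, assuming the repo id `microsoft/xtremedistil-l12-h384-uncased` matches the card title; the 384-dimensional hidden states correspond to the H384 in the name:

```python
# Minimal sketch: encode a sentence with the distilled encoder.
# The repo id is an assumption based on the card title above.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "microsoft/xtremedistil-l12-h384-uncased"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

inputs = tokenizer("Distillation keeps models small.", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state

print(hidden.shape)  # (1, seq_len, 384): 12 layers, hidden size 384
```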